Efficient reinforcement learning: model-based Acrobot control

Author

  • Gary Boone
Abstract

Several methods have been proposed in the reinforcement learning literature for learning optimal policies for sequential decision tasks. Q-learning is a model-free algorithm that has recently been applied to the Acrobot, a two-link arm with a single actuator at the elbow that learns to swing its free endpoint above a target height. However, applying Q-learning to a real Acrobot may be impractical due to the large number of required movements of the real robot as the controller learns. This paper explores the planning speed and data efficiency of explicitly learning models, as well as using heuristic knowledge to aid the search for solutions and reduce the amount of data required from the real robot.


Similar resources

High-accuracy value-function approximation with neural networks applied to the acrobot

Several reinforcement-learning techniques have already been applied to the Acrobot control problem, using linear function approximators to estimate the value function. In this paper, we present experimental results obtained by using a feedforward neural network instead. The learning algorithm used was model-based continuous TD(λ). It generated an efficient controller, producing a high-accuracy ...


Application of reinforcement learning to balancing of acrobot

The acrobot is a two-link robot, actuated only at the joint between the two links. It is one of the difficult tasks in reinforcement learning (RL) to control the acrobot because it has nonlinear dynamics and continuous state and action spaces. In this article, we discuss applying the RL to the task of balancing control of the acrobot. Our RL method has an architecture similar to the actor-critic. The ...


Using BELBIC based optimal controller for omni-directional threewheel robots model identified by LOLIMOT

In this paper, an intelligent controller is applied to control omni-directional robots motion. First, the dynamics of the three wheel robots, as a nonlinear plant with considerable uncertainties, is identified using an efficient algorithm of training, named LoLiMoT. Then, an intelligent controller based on brain emotional learning algorithm is applied to the identified model. This emotional l...


Representation Discovery for Kernel-Based Reinforcement Learning

Recent years have seen increased interest in non-parametric reinforcement learning. There are now practical kernel-based algorithms for approximating value functions; however, kernel regression requires that the underlying function being approximated be smooth on its domain. Few problems of interest satisfy this requirement in their natural representation. In this paper we define value-consiste...




Publication date: 1997